workload-replay: require the parser by default for query literals#36799
Closed
jasonhernandez wants to merge 2 commits into
Closed
workload-replay: require the parser by default for query literals#36799jasonhernandez wants to merge 2 commits into
jasonhernandez wants to merge 2 commits into
Conversation
Previously, if the mz-sql-anonymize binary was missing or a statement failed to parse, the anonymizer silently fell back to a regex that only redacts single-quoted strings — leaving numbers, dollar-quoted strings, and comments in query SQL exposed. For a privacy tool, fail-open is the wrong default. Add --require-parser (default on). When set, the tool errors rather than emit weaker output if the parser binary is unavailable or any captured query does not parse. --no-require-parser opts back into the regex fallback, which now warns that it is in use. Also resolve the output target up front so an invalid invocation fails before any work, independent of the new parser check. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per-statement parse failures are a property of the captured SQL, not of whether the parser is present, so they fall back to the regex with a warning in both modes (the verify pass still scans them). --require-parser now errors only when the parser binary itself is unavailable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
078962f to
f2a7a8b
Compare
This was referenced May 29, 2026
Contributor
Author
|
Superseded by #36803, which squashes this stack into a single PR against main. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fifth in the stack — base
workload-anonymize-subsource-keys(#36797).Change
Flips the anonymizer to fail-closed on literal redaction. Previously, if the
mz-sql-anonymizebinary was missing (or an individual captured statement didn't parse), the tool silently fell back to a regex that only redacts single-quoted strings — leaving numbers, dollar-quoted strings, and comments in query SQL exposed, with just a warning. For a privacy tool, fail-open is the wrong default.New
--require-parserflag, on by default:--no-require-parseropts back into the regex fallback (which now warns it's active).Also moved output-target resolution to the top of
main()so an invalid invocation (no -o, no--in-place) fails immediately, independent of the parser check.Behavior summary
--no-require-parserTesting
test_require_parser_errors_without_binarycovers the fail-closed default; the existing regex-path tests now pass--no-require-parservia therun_toolhelper (which defaults to it so tests are deterministic whether or not the binary is built).bin/fmt,ruffclean.🤖 Generated with Claude Code